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Abstract 

Given  an  m  x  n  matrix  A  and  a  positive  integer  k,  we  introduce  a  randomized 
procedure  for  the  approximation  of  A  with  a  matrix  Z  of  rank  k.  The  procedure  relies 
on  applying  to  a  collection  of  I  random  vectors,  where  I  is  an  integer  equal  to  or 
slightly  greater  than  k;  the  scheme  is  efficient  whenever  A  and  A"^  can  be  applied  rapidly 
to  arbitrary  vectors.  The  discrepancy  between  A  and  Z  is  of  the  same  order  as  the 
(fc+l)®*  greatest  singular  value  ak+i  of  A,  with  negligible  probability  of  even  moderately 
large  deviations.  The  actual  estimates  derived  in  the  paper  are  fairly  complicated,  but 
are  simpler  when  I  —  k  is  a  fixed  small  nonnegative  integer.  For  example,  according  to 
one  of  our  estimates  for  I  —  k  =  20,  the  probability  that  the  spectral  norm  ||A  —  Z\\  is 
greater  than  10  p(/c  +  20)  m  ak+i  is  less  than  10“^’^.  The  paper  contains  a  number  of 
estimates  for  ||A  — Z||,  including  several  that  are  stronger  (but  more  detailed)  than  the 
preceding  example;  some  of  the  estimates  are  effectively  independent  of  m.  Thus,  given 
a  matrix  A  of  limited  numerical  rank,  such  that  both  A  and  A"^  can  be  applied  rapidly 
to  arbitrary  vectors,  the  scheme  provides  a  simple,  efficient  means  for  constructing 
an  accurate  approximation  to  a  Singular  Value  Decomposition  of  A.  Furthermore,  the 
algorithm  presented  here  operates  reliably  independently  of  the  structure  of  the  matrix 
A.  The  results  are  illustrated  via  several  numerical  examples. 

1  Introduction 

In  many  practical  circumstances,  it  is  desirable  to  approximate  a  matrix  A  with  a  sum  of 
rank-1  matrices.  Such  an  approximation  of  A  often  facilitates  understanding  of  the  properties 
of  A.  Moreover,  if  the  approximation  involves  only  a  small  number  of  rank-1  matrices,  then 
the  approximation  also  facilitates  rapid  calculations  involving  A. 

There  are  at  least  two  classical  forms  of  such  matrix  approximations.  One  is  an  ap¬ 
proximation  to  a  Singular  Value  Decomposition  (SVD),  which  is  known  in  the  statistical 
literature  as  a  Principal  Component  Analysis.  The  other  is  an  approximation  obtained  via 
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subset  selection;  we  will  refer  to  the  matrix  representation  obtained  via  subset  selection  as 
an  interpolative  decomposition.  These  two  types  of  matrix  approximations  are  dehned  as 
follows. 

An  approximation  to  an  SVD  of  a  real  m  x  n  matrix  A  consists  of  nonnegative  real 
numbers  Ui,  (T2,  ...,  (Jk-i,  Uk  known  as  singular  values,  orthonormal  real  m  x  1  column 
vectors  ...,  known  as  left  singular  vectors,  and  orthonormal  real  n  x  1 

column  vectors  . . . ,  known  as  right  singular  vectors,  such  that 


i=i 


a  4 


(1) 


where  k,  m,  and  n  are  positive  integers,  5  is  a  positive  real  number  specifying  the  precision 
of  the  approximation,  and,  for  any  matrix  B,  ||i?||  denotes  the  spectral  (/^-operator)  norm 
of  B,  that  is,  ||i?||  is  the  greatest  singular  value  of  B.  An  approximation  to  an  SVD  of  A  is 
often  written  in  the  equivalent  form 


||A- VS  V^ll  <  5,  (2) 

where  V  is  a  real  mxk  matrix  whose  columns  are  orthonormal,  V  is  a  real  nxk  matrix  whose 
columns  are  orthonormal,  and  S  is  a  real  k  x  k  matrix  whose  entries  are  all  nonnegative  and 
whose  entries  off  of  the  main  diagonal  are  zero.  See,  for  example,  [21]  for  a  discussion  of 
SVDs. 

An  interpolative  decomposition  of  a  real  m  x  n  matrix  A  consists  of  a  real  mxk  matrix 
B  whose  columns  constitute  a  subset  of  the  columns  of  A,  and  a  real  k  x  n  matrix  P,  such 
that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 

2.  no  entry  of  P  has  an  absolute  value  greater  than  2,  and 

3.  A  =  BP. 

See,  for  example,  [20],  [6],  [16],  [25],  or  Sections  4  and  5  of  [4]  for  a  discussion  of  interpolative 
decompositions. 

Given  an  algorithm  permitting  the  fast  application  of  a  numerically  low-rank  matrix  A, 
and  an  algorithm  permitting  the  fast  application  of  A"*",  the  algorithm  of  the  present  paper 
provides  a  simple,  efficient  way  for  computing  an  accurate  approximation  to  an  SVD  of  A. 
Moreover,  the  algorithm  provides  a  similar  method  for  computing  an  accurate  approximation 
to  an  interpolative  decomposition  of  A  under  the  same  conditions. 

Our  scheme  also  provides  an  efficient,  robust  means  for  approximating  the  k  greatest 
singular  values  and  corresponding  singular  vectors  of  any  matrix  A  for  which  a  representation 
enabling  the  fast  application  of  both  A  and  is  available.  The  precision  5  of  the  resulting 
approximation  given  by  formula  (2)  is  at  most  a  reasonably  small  multiple  of  the  {k  -|- 
l)st  siugulur  value  of  A.  In  this  regard,  the  algorithm  described  below  should  be 

compared  to  the  classical  Lanczos  method  (for  a  description  of  the  Lanczos  method,  see,  for 
example.  Chapter  9  in  [15]). 
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Unlike  the  deterministic  Lanczos  scheme,  the  algorithm  of  the  present  paper  is  a  ran¬ 
domized  one,  and  fails  with  a  rather  negligible  probability.  Examples  of  the  probabilities 
involved  can  be  found  in  (103)  and  (115)  in  Section  4  below. 

Some  potential  applications  of  the  algorithm  include  Ending  the  eigenmodes  of  certain 
networks,  mining  digital  documents  for  information  via  latent  semantic  analysis,  computing 
electron  densities  within  the  density  functional  theory  of  quantum  chemistry,  simplifying 
the  implementation  of  algorithms  for  fast  matrix  inversion  that  are  based  on  the  compres¬ 
sion  of  blocks  within  matrices,  and  improving  condition  number  estimation  and  subspace 
determination  algorithms  that  are  based  on  inverse  iteration. 

We  should  point  out  that  [19]  and  [4]  motivated  many  aspects  of  the  algorithm  and 
analysis  of  the  present  paper.  We  would  also  like  to  highlight  [16],  [25],  [18],  and  [17], 
which  came  to  our  attention  after  we  had  released  the  initial  version  of  the  present  work. 
Moreover,  a  number  of  recent  publications  address  issues  similar  to  those  addressed  by  the 
present  paper;  we  refer  the  reader  to  [24]  and  [2],  which  describe  deterministic  methods, 
and  to  [1],  [7],  [8],  [9],  [10],  [11],  [12],  [13],  [22],  [23],  and  the  extensive  references  contained 
therein,  all  of  which  describe  probabilistic  Monte  Carlo  methods. 

We  do  not  analyze  in  detail  the  efiects  of  round-ofi  upon  the  algorithm  of  the  present 
paper.  However,  most  of  the  bounds  that  we  discuss  have  finite-precision  analogues.  This  is 
confirmed  by  both  our  preliminary  analysis  and  our  numerical  experiments  (some  of  which 
are  described  in  Section  5  below).  For  simplicity,  we  discuss  only  real  matrices;  the  analysis 
below  extends  easily  to  the  complex  case. 

The  present  paper  has  the  following  structure:  Section  2  collects  together  various  known 
facts  which  later  sections  utilize.  Section  3  provides  the  principal  lemmas  which  Section  4  uses 
to  construct  algorithms.  Section  4  describes  the  algorithm  of  the  present  paper,  providing 
details  about  its  accuracy  and  computational  costs,  and  Section  5  illustrates  the  algorithm 
via  several  numerical  examples. 

2  Preliminaries  from  linear  algebra  and  the  theory  of 
probability 

In  this  section,  we  summarize  various  facts  about  matrices.  Subsection  2.1  discusses  the 
approximation  of  arbitrary  matrices.  Subsection  2.2  discusses  the  singular  values  of  arbitrary 
matrices.  Subsection  2.3  discusses  the  singular  values  of  certain  random  matrices. 

In  the  present  section  and  throughout  the  rest  of  the  paper,  we  employ  the  following 
notation.  In  accordance  with  the  standard  practice,  we  will  denote  the  base  of  the  natural 
logarithm  by  e.  We  will  denote  an  identity  matrix  by  1,  and  a  matrix  whose  entries  are  all 
zero  by  0.  For  any  matrix  A,  we  define  the  norm  ||yl||  of  A  to  be  the  spectral  (/^-operator) 
norm  of  A,  that  is,  ||H||  is  the  greatest  singular  value  of  A.  For  any  positive  integer  n,  and 
real  n  x  1  column  vector  v  G  R"',  we  define  the  norm  ||n||  of  n  to  be  the  root-sum-square  (/^ 
norm)  of  the  entries  of  v,  that  is. 


n 


(3) 
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where  Vk  is  the  entry  of  v.  (Of  course,  the  norm  of  v  as  viewed  as  a  real  n  x  1  matrix  is 
equal  to  the  norm  of  v  as  viewed  as  a  real  n  x  1  column  vector.) 

2.1  Approximation  of  general  matrices 

The  following  lemma  states  that,  for  any  m  x  n  matrix  A  whose  rank  is  k,  where  k,  m,  and 
n  are  positive  integers,  there  exist  an  m  x  /c  matrix  B  whose  columns  constitute  a  subset  of 
the  columns  of  A,  and  a  k  x  n  matrix  P,  such  that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 

2.  P  is  not  too  large,  and 

3.  BP  =  A. 

Moreover,  the  lemma  provides  an  analogous  approximation  B  P  to  A  when  the  exact  rank 
of  A  is  not  k,  but  the  {k  +  1)®*  singular  value  of  A  is  nevertheless  small.  The  lemma  is  a 
reformulation  of  Theorem  3.2  in  [20]  and  Theorem  3  in  [6]. 

Lemma  1  Suppose  that  m  and  n  are  positive  integers,  and  A  is  a  real  m  x  n  matrix. 

Then,  for  any  positive  integer  k  with  k  <  m  and  k  <  n,  there  exist  a  real  k  x  n  matrix 
P,  and  a  real  m  x  k  matrix  B  whose  columns  constitute  a  subset  of  the  columns  of  A,  such 
that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 

2.  no  entry  of  P  has  an  absolute  value  greater  than  1, 

3.  ||P||  <  ^/k{n  -  k)  +  I, 

4-  the  least  (that  is,  the  k^^  greatest)  singular  value  of  P  is  at  least  1, 

5.  B  P  =  A  when  k  =  m  or  k  =  n,  and 

6.  \\B  P  —  A\\  <  ^Jk(n  —  k)  +  1  (Jk+i  when  k  <m  and  k  <  n,  where  ak+i  is  the  (k  +  1)^* 
greatest  singular  value  of  A. 

Remark  2  Properties  1,  2,  3,  and  4  in  Lemma  1  ensure  that  the  interpolative  decomposition 
B  P  of  A  is  numerically  stable.  Also,  Property  3  follows  directly  from  Properties  1  and  2, 
and  Property  4  follows  directly  from  Property  1. 

Observation  3  There  exists  an  algorithm  which  computes  B  and  P  in  Lemma  1  from  A, 
provided  that  we  require  only  that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 

2.  no  entry  of  P  has  an  absolute  value  greater  than  2, 

3.  ||P||  <  a/4/c  (n  —  /c)  +  1, 
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4.  the  least  (that  is,  the  greatest)  singular  value  of  P  is  at  least  1, 

5.  B  P  =  A  when  k  =  m  or  k  =  n,  and 

6.  \\B  P  —  A||  <  y^Ak  {n  —  k)  +  1  when  k  <  m  and  k  <  n,  where  ak+i  is  the  {k  +  1)®* 
greatest  singular  value  of  A. 

For  any  positive  real  number  e,  the  algorithm  can  identify  the  least  k  such  that  ||-B  -P— ^||  ~  £• 
Furthermore,  there  exists  a  real  number  C  such  that  the  algorithm  computes  both  B  and 
P  using  at  most  Ckmn  log(n)  floating-point  operations  and  Ckmn  floating-point  words  of 
memory.  The  algorithm  is  based  upon  the  Cramer  rule  and  the  ability  to  obtain  the  minimal- 
norm  (or  at  least  roughly  minimal-norm)  solutions  to  linear  algebraic  systems  of  equations 
(see  [20],  [6],  and  [19]). 

The  following  lemma  provides  an  approximation  Q  S  to  an  n  x  I  matrix  R  via  an  n  x  k 
matrix  Q  whose  columns  are  orthonormal,  and  a  k  x  I  matrix  S. 

Lemma  4  Suppose  that  k,  I,  and  n  are  positive  integers  with  k  <  I  and  I  <  n,  and  R  is  a 
real  n  x  I  matrix. 

Then,  there  exist  a  real  n  x  k  matrix  Q  whose  columns  are  orthonormal,  and  a  real  k  x  I 
matrix  S,  such  that 

\\Q  S  -  R\\  <  (4) 

where  is  the  {k  +  1)®*  greatest  singular  value  of  R. 

Proof.  We  start  by  forming  an  SVD  of  R, 

R  =  UT.V^,  (5) 

where  U  is  a  real  n  x  I  matrix  whose  columns  are  orthonormal,  P  is  a  real  I  x  I  matrix 
whose  columns  are  orthonormal,  and  S  is  a  real  I  x  I  matrix  whose  entries  are  nonnegative 
everywhere  and  zero  off  of  the  main  diagonal,  such  that 

Tjj  =  pj  (6) 

for  all  j  =  1,  2,  . . . ,  /  —  1,  /,  where  Tjj  is  the  entry  in  row  j  and  column  j  of  S,  and  pj  is 
the  greatest  singular  value  of  R.  We  dehne  Q  to  be  the  leftmost  n  x  k  block  of  U ,  and 
P  to  be  the  rightmost  n  x  [I  —  k)  block  of  U,  so  that 

U^{Q\P).  (7) 

We  dehne  S  to  be  the  uppermost  k  x  I  block  of  S  and  T  to  be  the  lowermost  {I  —  k)  x  I 
block  of  S  ,  so  that 

S  =  ([A]) .  (8) 

Combining  (5),  (6),  (7),  (8),  and  the  fact  that  the  columns  of  U  are  orthonormal,  as  are  the 
columns  of  V,  yields  (4).  □ 

Observation  5  In  order  to  compute  the  matrices  Q  and  S  in  (4)  from  the  matrix  R,  we  can 
construct  (5),  and  then  form  Q  and  S  according  to  (7)  and  (8).  (See,  for  example.  Chapter  8 
in  [15]  for  details  concerning  the  computation  of  the  SVD.) 
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2.2  Singular  values  of  general  matrices 

The  following  technical  lemma  will  be  needed  in  Section  3. 

Lemma  6  Suppose  that  m  and  n  are  positive  integers  with  m  >  n.  Suppose  further  that  A 
is  a  real  m  x  n  matrix  sueh  that  AA  A  is  invertible. 

Then, 

=  (9) 

where  is  the  least  (that  is,  the  greatest)  singular  value  of  A. 

Proof.  We  form  an  SVD  of  A, 

A  =  U^V^,  (10) 

where  is  a  real  unitary  mxm  matrix,  S  is  a  real  mxn  matrix  whose  entries  are  nonnegative 
everywhere  and  zero  off  of  the  main  diagonal,  and  P  is  a  real  unitary  nxn  matrix.  It  follows 
from  (10)  that 

{A^  A)-^  A^  =  v  (ii) 

Combining  (11)  and  the  fact  that  U  and  V  are  unitary  yields 

||(A^A)-M^||  =  ||(S^S)-^S^||  .  (12) 

Combining  (12)  and  the  fact  that  S  is  zero  off  of  the  main  diagonal  yields  (9).  □ 

The  following  lemma  provides  what  is  known  as  the  Courant-Fischer  maximin  character¬ 
ization  of  singular  values;  Theorem  8.1.2  in  [15]  provides  an  equivalent  formulation  of  (13). 


Lemma  7  Suppose  that  m  and  n  are  positive  integers,  and  A  is  a  real  mxn  matrix. 
Then,  the  greatest  singular  value  Ck  of  A  is  given  by  the  formula 


\\Av\ 

ak  =  max  mm  — ^ 

SCR":  dim 5=^  -uSS:  ||t>||7^0  U 


for  all  k  =  1,  2,  ...,  min(m,n)  —  1,  min(m, n),  where  the  maximum  is  taken  over  all  k- 
dimensional  subspaees  of  E”,  and  the  minimum  is  taken  over  all  vectors  in  S  that  have 
nonzero  norms. 


The  following  lemma  states  that  the  singular  values  of  the  product  GA  of  matrices  G 
and  A  are  at  most  ||G||  times  greater  than  the  corresponding  singular  values  of  A. 

Lemma  8  Suppose  that  I,  m,  and  n  are  positive  integers,  A  is  a  real  mxn  matrix,  and  G 
is  a  real  I  x  m  matrix. 

Then,  the  greatest  singular  value  pk  of  the  product  G  A  is  at  most  a  factor  of  ||G|| 
times  the  k^^  greatest  singular  value  ak  of  A,  that  is, 

Pfc<||G||(Tfc  (14) 

for  all  k  =  1,  2,  . . . ,  min(/,  m,  n)  —  1,  min(/,  m,  n) . 
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Proof.  For  any  vector  v  G  R"'  with  ||n||  7^  0, 


\\GAv\ 


<  IIGII 


\\Av\ 


(15) 


Combining  (13)  and  (15)  yields  (14). 


□ 


The  following  lemma  states  that  the  greatest  singular  value  of  a  matrix  A  is  at  least  as 
large  as  the  greatest  singular  value  of  any  rectangular  block  of  entries  in  A]  the  lemma  is  a 
straightforward  consequence  of  the  minimax  properties  of  singular  values  (see,  for  example, 
Section  47  of  Chapter  2  in  [26]). 


Lemma  9  Suppose  that  k,  I,  m,  and  n  are  positive  integers  with  k  <  m  and  I  <  n.  Suppose 
further  that  A  is  a  real  m  x  n  matrix,  and  B  is  a  k  x  I  rectangular  block  of  entries  in  A. 
Then,  the  greatest  singular  value  of  B  is  at  most  the  greatest  singular  value  of  A. 


The  following  lemma  states  that  the  singular  values  of  an  (n  —  1)  x  n  block  of  rows  of 
an  n  X  n  matrix  A  interlace  the  singular  values  of  A]  Corollary  8.6.3  in  [15]  provides  an 
equivalent  formulation  of  (16). 


Lemma  10  Suppose  that  n  is  a  positive  real  number,  A  is  a  real  n  x  n  matrix,  and  R  is  an 
(n  —  1)  X  n  rectangular  block  of  entries  in  A. 

Then,  the  singular  values  ai,  (J2,  . . . ,  (Jn-i,  cr-n  of  A  and  the  singular  values  pi,  p2,  ■  ■  ■ , 
Pn-2,  Pn-i  of  R  satisfy  the  inegualities 

>  Pi  >  <72  >  P2  >  •  •  •  >  <7n-2  >  Pn-2  >  <7n-l  >  Pn-1  >  <7n-  (16) 

The  following  lemma  states  that  if  the  norm  of  the  difference  of  two  matrices  is  small,  then 
their  corresponding  singular  values  are  close;  Corollary  8.6.2  in  [15]  provides  an  equivalent 
formulation  of  (17). 

Lemma  11  Suppose  that  m  and  n  are  positive  integers,  and  A  and  R  are  real mxn  matrices. 

Then,  the  k^^  greatest  singular  value  Tk  of  the  sum  A  +  R  and  the  greatest  singular 
value  at  of  A  differ  by  at  most  ||i?||,  that  is, 

kfc  -  o-fcl  <  ||i?||  (17) 

for  all  k  =  1,  2,  . . . ,  min(m,  n)  —  1,  min(m,  n) . 


2.3  Singular  values  of  random  matrices 

The  following  lemma  provides  a  highly  probable  upper  bound  on  the  greatest  singular  value 
of  a  square  matrix  whose  entries  are  independent,  identically  distributed  (i.i.d.)  Gaussian 
random  variables  of  zero  mean  and  unit  variance;  Formula  8.8  in  [14]  provides  an  equivalent 
formulation  of  the  lemma. 


7 


Lemma  12  Suppose  that  n  is  a  positive  integer,  G  is  a  real  n  x  n  matrix  whose  entries  are 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  varianee,  and  j  is  a  positive  real 
number,  such  that  7  >  1  and 

272 

£7^-1 

is  nonnegative. 

Then,  the  greatest  singular  value  of  G  is  at  most  v^^7  with  probability  not  less  than  the 
amount  in  (18). 

Combining  Lemmas  9  and  12  yields  the  following  lemma,  providing  a  highly  proba¬ 
ble  upper  bound  on  the  greatest  singular  value  of  a  rectangular  matrix  whose  entries  are 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance. 

Lemma  13  Suppose  that  I,  m,  and  n  are  positive  integers  with  n  >  I  and  n  >  m.  Suppose 
further  that  G  is  a  real  I  x  m  matrix  whose  entries  are  i.i.d.  Gaussian  random  variables  of 
zero  mean  and  unit  variance,  and  ■y  is  a  positive  real  number,  such  that  7  >  1  and  (18)  is 
nonnegative. 

Then,  the  greatest  singular  value  of  G  is  at  most  v^^7  with  probability  not  less  than  the 
amount  in  (18). 

The  following  lemma  provides  a  highly  probable  lower  bound  on  the  least  singular  value 
of  a  rectangular  matrix  whose  entries  are  i.i.d.  Gaussian  random  variables  of  zero  mean  and 
unit  variance;  Formula  2.5  in  [5]  and  the  proof  of  Lemma  4.1  in  [5]  together  provide  an 
equivalent  formulation  of  Lemma  14. 

Lemma  14  Suppose  that  k  and  I  are  positive  integers  with  k  <  1.  Suppose  further  that  G 
is  a  real  I  x  k  matrix  whose  entries  are  i.i.d.  Gaussian  random  variables  of  zero  mean  and 
unit  variance,  and  (3  is  a  positive  real  number,  such  that 


(I  —  k  +  1) 

is  nonnegative. 

Then,  the  least  (that  is,  the  greatest) 
probability  not  less  than  the  amount  in  (19). 

3  Mathematical  apparatus 

In  this  section,  we  describe  the  principal  tools  used  in  Section  4. 

The  following  lemma  states  that  the  product  B  P  of  matrices  B  and  P  is  a  good  approx¬ 
imation  to  a  matrix  A,  provided  that  there  exists  a  matrix  G  such  that 


{l-k  +  l)(3 


(19) 


singular  value  of  G  is  at  least  f/{\A  /3)  with 


1  - 


4  (72  —  1) 


1.  the  columns  of  B  constitute  a  subset  of  the  columns  of  A, 


3.  G -BP  is  a  good  approximation  to  G  A,  and 

4.  there  exists  a  matrix  F  such  that  ||P||  is  not  too  large,  and  FG  A  is  a  good  approxi¬ 
mation  to  A. 

Lemma  15  Suppose  that  k,  I,  m,  and  n  are  positive  integers  with  k  <  n.  Suppose  further 
that  A  is  a  real  mxn  matrix,  B  is  a  real  mx  k  matrix  whose  columns  constitute  a  subset  of 
the  columns  of  A,  P  is  a  real  k  x  n  matrix,  F  is  a  real  m  x  I  matrix,  and  G  is  a  real  I  x  m 
matrix. 

Then, 

\\BP-A\\  <  ||PGA-yl||(||P||  +  l)  +  ||P||||GPP-GA||.  (20) 

Proof.  We  observe  that 

\\BP-A\\  <\\BP-FGBP\\  +  \\FGBP-FGA\\  +  \\FGA-A\\,  (21) 

i|BP-PGBP||  <  ||B-PGB||  ||P||,  (22) 

and 

\\FGBP-FGA\\  <  ||P||  \\GBP-GA\\.  (23) 

Since  the  columns  of  B  constitute  a  subset  of  the  columns  of  A,  it  follows  that  the  columns 
of  B  —  F  G  B  constitute  a  subset  of  the  columns  of  A  —  F  G  A,  and  therefore, 

\\B-FGB\\<\\A-FGA\\.  (24) 

Combining  (21),  (22),  (23),  and  (24)  yields  (20).  □ 

Remark  16  Since  the  columns  of  B  constitute  a  subset  of  the  columns  of  A  in  Lemma  15, 
it  follows  that  the  columns  of  GP  constitute  a  subset  of  the  columns  of  G  A.  Conversely, 
whenever  a  matrix  S  is  formed  by  gathering  distinct  columns  of  GA  together  into  S,  then 
clearly  S  =  G  B  for  some  matrix  B  whose  columns  constitute  a  subset  of  the  columns  of  A. 

The  following  lemma  states  that  the  product  A  Q  of  matrices  A,  Q,  and  is  a  good 
approximation  to  a  matrix  A,  provided  that  there  exist  matrices  G  and  S  such  that 

1.  the  columns  of  Q  are  orthonormal, 

2.  Q  S  is  a  good  approximation  to  (GA)"^,  and 

3.  there  exists  a  matrix  F  such  that  ||P||  is  not  too  large,  and  FG  A  is  a  good  approxi¬ 
mation  to  A. 

Lemma  17  Suppose  that  k,  I,  m,  and  n  are  positive  integers  with  k  <  n.  Suppose  further 
that  A  is  a  real  mxn  matrix,  Q  is  a  real  n  x  k  matrix  whose  columns  are  orthonormal,  S 
is  a  real  k  x  I  matrix,  F  is  a  real  m  x  I  matrix,  and  G  is  a  real  I  x  m  matrix. 

Then, 

\\AQQ^-A\\  <2\\FGA-A\\+2\\F\\\\QS-{GAf\\.  (25) 
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Proof.  The  proof  is  straightforward,  but  tedious,  as  follows. 

We  obtain  from  the  triangle  inequality  that 

\\AQQ^  -A\\  <  \\AQQ^  -FGAQQ^W  +  \\FGAQQ^  -FGA\\  +  \\FGA-A\\.  (26) 
First,  we  provide  a  bound  for  \\AQQ^  —  F  G  AQ  Q'^W.  Clearly, 

\\AQQ^  -FGAQQ^W  <  \\A-FGA\\  ||Q||  \\Q'^\\ 

It  follows  from  the  fact  that  the  columns  of  Q  are  orthonormal  that 

IIQII  <  1 

and 

IIQ^II  <  1- 

Combining  (27),  (28),  and  (29)  yields 

\\AQQ^ -FGAQQ^W  <  \\A-FGA\\. 

Next,  we  provide  a  bound  for  \\F  G  AQ  —  F  G  A\\.  Clearly, 

WFGAQQ^ -FGA\\  <  ||F||  \\GAQQ^ -GA\\. 

It  follows  from  the  triangle  inequality  that 


\\GAQQ^-GA\\  <  \\GAQQ^-S'^Q^QQ'^\\  +  \\S^Q^QQ^-S'^Q^\ 

Furthermore, 

I  +  IISToT-gaII. 

(32) 

WGAQQ^  -s^Q^QQ^W  <  ||g||  lig^l 

|.  (33) 

Combining  (33),  (28),  and  (29)  yields 

lIGTigg^-^Tg^gg^ll  <  \\ga-s^q^\\. 

(34) 

Also,  it  follows  from  the  fact  that  the  columns  of  Q  are  orthonormal  that 

g^g  =  1. 

(35) 

It  follows  from  (35)  that 

ll^^g^gg^-A^g^ll  =0. 

(36) 

Combining  (32),  (34),  and  (36)  yields 

\\GAQQ^-GA\\  <2\\S^Q^  -GA\\. 

(37) 

Combining  (31)  and  (37)  yields 

IlFGAgg^  -  FGA\\  <  2  ||F||  \\S^  -  GA||. 

(38) 

Combining  (26),  (30),  and  (38)  yields  (25). 

□ 

The  following  lemma  states  that,  for  any  matrix  A,  and  matrix  G  whose  entries  are 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  with  very  high  probability 
there  exists  a  matrix  F  with  a  reasonably  small  norm,  such  that  F  G  A  is  a  good  approxi¬ 
mation  to  A. 

(27) 

(28) 

(29) 

(30) 

(31) 
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Lemma  18  Suppose  that  k,  I,  m,  and  n  are  positive  integers  with  k  <  I,  such  that  I  <  m 
and  I  <  n.  Suppose  further  that  A  is  a  real  m  x  n  matrix,  G  is  a  real  I  x  m  matrix  whose 
entries  are  i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  and  f3  and  7 
are  positive  real  numbers,  such  that  7  >  1  and 


1  - 


^y27i~(^^^~k^^T)  \{l  —  k  + 1)  p 


l—k+1 


1 


27^ 

4  (72  —  1) 


is  nonnegative. 

Then,  there  exists  a  real  m  x  I  matrix  F  such  that 


and 


\FGA-A\\  <  ^2lm/3^  72  +  1  ak+i 


||F||  < 


(39) 


(40) 


(41) 


with  probability  not  less  than  the  amount  in  (39),  where  ak+i  is  the  {k  +  1)®*  greatest  singular 
value  of  A. 


Proof.  We  prove  the  existence  of  a  matrix  F  satisfying  (40)  and  (41)  by  constructing  one. 
We  start  by  forming  an  SVD  of  A, 


A  =  UEV^,  (42) 

where  1/  is  a  real  unitary  mxm  matrix,  S  is  a  real  mxn  matrix  whose  entries  are  nonnegative 
everywhere  and  zero  off  of  the  main  diagonal,  and  P  is  a  real  unitary  n  x  n  matrix,  such 
that 

=  ai  (43) 

for  alH  =  1,  2,  . . . ,  min(m,n)  —  1,  min(m,n),  where  Sj  j  is  the  entry  in  row  i  and  column  i 
of  S,  and  a*  is  the  greatest  singular  value  of  A. 

Next,  we  dehne  auxiliary  matrices  H,  R,  and  P.  We  dehne  H  to  be  the  leftmost  /  x  k 
block  of  the  I  x  m  matrix  G  U,  and  R  to  be  the  rightmost  I  x  [m  —  k)  block  of  G  U,  so  that 

GU  =  {  H  \  R)  .  (44) 

Combining  the  facts  that  U  is  real  and  unitary,  and  that  the  entries  of  G  are  i.i.d.  Gaussian 
random  variables  of  zero  mean  and  unit  variance,  we  see  that  the  entries  of  H  are  also 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  as  are  the  entries  of  R.  We 
dehne  to  be  the  real  k  x  I  matrix  given  by  the  formula 

rG^)  =  H)-^  .  (45) 

We  dehne  P  to  be  the  m  x  I  matrix  whose  uppermost  k  x  I  block  is  RG^) ^  and  whose  entries 
in  the  lowermost  {m  —  k)  x  I  block  are  zero,  so  that 

^  =  (^)  ■  m 
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Finally,  we  define  F  to  be  the  m  x  I  matrix  given  by 


F=UP=U 


ij(-i) 

0 


Combining  (45),  (9),  the  fact  that  the  entries  of  H  are  i.i.d.  Gaussian  random  variables 
of  zero  mean  and  unit  variance,  and  Lemma  14  yields 


with  probability  not  less  than 


^27r(/-A;  +  l)  +  ^  ^ 

Combining  (47),  (48),  and  the  fact  that  U  is  unitary  yields  (41). 

We  now  show  that  F  dehned  in  (47)  satishes  (40). 

We  dehne  S  to  be  the  leftmost  uppermost  k  x  k  block  of  S,  and  T  to  be  the  rightmost 
lowermost  {m  —  k)  x  {n  —  k)  block  of  S,  so  that 

s  =  (4\f)  ■  (50) 


Combining  (42),  (44),  and  (47)  yields 


FGA-A=U 


Combining  (45)  and  (50)  yields 


i7(-i) 


if(-i) 


0  I 


Furthermore, 


0  I 


<  RT  ^ +  \\Tf. 


Moreover, 

RT\\  <  \\R\\  ||T||.  (54) 

Combining  (50)  and  (43)  yields 

||T||  <  a,+i.  (55) 

Combining  (51),  (52),  (53),  (54),  (55),  and  the  fact  that  U  and  V  are  unitary  yields 

\\FGA-  All  <  y^||7/(-b||2  ||7?||2  +  i  (56) 

Combining  Lemma  13  and  the  fact  that  the  entries  of  R  are  i.i.d.  Gaussian  random 
variables  of  zero  mean  and  unit  variance  shows  that 


||7?||  <  V2m  7 
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with  probability  not  less  than 


£7^-1 

Combining  (56),  (48),  and  (57)  yields  (40).  □ 

The  following  lemma  is  very  similar  to  Lemma  18.  Lemma  19  is  tighter  than  Lemma  18 
when  the  singular  values  of  the  matrix  A  decay  sufficiently  fast,  and  the  numbers  j  and  I  in 
the  lemma  are  both  much  less  than  m. 


4  (72  —  1) 


Lemma  19  Suppose  that  j,  k,  I,  m,  and  n  are  positive  integers  with  k  <  I,  such  that 
k  +  j  <  m  and  k  +  j  <  n,  as  well  as  I  <  m  and  I  <  n.  Suppose  further  that  A  is  a  real  mx  n 
matrix,  G  is  a  real  I  x  m  matrix  whose  entries  are  i.i.d.  Gaussian  random  variables  of  zero 
mean  and  unit  variance,  and  f3  and  7  are  positive  real  numbers,  such  that  7  >  1  and 


$  =  1  - 


\/27r  {I  —  k  +  1)  \{l  —  k  +  1) /3 

1 


27" 


2  \  max(m— fc— jf, 


4  (7^  —  1)  a/tt  max(m  —  k  —  1)  7^ 

1 


2-1 


4  (72  —  1) 


27' 


2  \  max(j,Z) 


y2-1 


(59) 


is  nonnegative. 

Then,  there  exists  a  real  m  x  I  matrix  F  such  that 
||F  G  A  — yl||  <  ^/2l  max(j,  /)  7^  +  1  ak+i  +  \/2l  max(m  —  k  —  j,  1)  7^  +  1  (Tk+j+i  (60) 

and 

\\F\\<Vifi  (61) 

with  probability  not  less  than  the  amount  in  (59),  where  is  the  {k  +  1)®*  greatest  singular 
value  of  A,  and  Ck+j+i  is  the  {k  +  j  +  1)^*  greatest  singular  value  of  A. 

Proof.  We  prove  the  existence  of  a  matrix  F  satisfying  (60)  and  (61)  by  constructing  one. 
We  start  by  forming  an  SVD  of  A, 


A  =  UTV^,  (62) 

where  7/  is  a  real  unitary  mxm  matrix,  S  is  a  real  mxn  matrix  whose  entries  are  nonnegative 
everywhere  and  zero  off  of  the  main  diagonal,  and  P  is  a  real  unitary  n  x  n  matrix,  such 
that 

=  (Tj  (63) 

for  alH  =  1,  2,  . . . ,  min(m,n)  —  1,  min(m,n),  where  Sj  j  is  the  entry  in  row  i  and  column  i 
of  S,  and  a*  is  the  greatest  singular  value  of  A. 
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Next,  we  define  auxiliary  matrices  H,  R,  F,  and  P.  We  define  H  to  be  the  leftmost  I  x  k 
block  of  the  /  x  m  matrix  GU ,  i?  to  be  the  I  x  j  block  of  GU  whose  first  column  is  the 
{k  +  1)®*  column  of  G  U,  and  F  to  be  the  rightmost  lx  {m  —  j  —  k)  block  of  G  U,  so  that 

GU  =  {  H  \  R\  T  )  .  (64) 


Combining  the  facts  that  U  is  real  and  unitary,  and  that  the  entries  of  G  are  i.i.d.  Gaussian 
random  variables  of  zero  mean  and  unit  variance,  we  see  that  the  entries  of  H  are  also 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  as  are  the  entries  of  R,  and 
as  are  the  entries  of  F.  We  dehne  to  be  the  real  k  x  /  matrix  given  by  the  formula 

=  {H^  H)-^  .  (65) 


We  dehne  P  to  be  the  real  m  x  /  matrix  whose  uppermost  k  x  /  block  is  H^~^\  whose  entries 
in  the  j  x  /  block  whose  hrst  row  is  the  {k  +  1)®*  row  of  P  are  zero,  and  whose  entries  in  the 
lowermost  {m  —  k  —  j)  x  I  block  are  zero,  so  that 


P  = 


Finally,  we  dehne  F  to  be  the  m  x  I  matrix  given  by 


F  =  UP 


U 


//(-!) 

0 

0 


(66) 


(67) 


Combining  (65),  (9),  the  fact  that  the  entries  of  H  are  i.i.d.  Gaussian  random  variables 
of  zero  mean  and  unit  variance,  and  Lemma  14  yields 

W^\\<Vlp  (68) 


with  probability  not  less  than 


^/2tt  {I  —  k  +  1) 


k  +  l)/]) 


(69) 


Combining  (67),  (68),  and  the  fact  that  U  is  unitary  yields  (61). 

We  now  show  that  F  dehned  in  (67)  satishes  (60). 

We  dehne  S  to  be  the  leftmost  uppermost  k  x  k  block  of  S,  T  to  be  the  j  x  j  block  of  S 
whose  leftmost  uppermost  entry  is  the  entry  in  the  {k  +  1)®*  row  and  {k  +  1)®*  column  of  S, 
and  0  to  be  the  rightmost  lowermost  {m  —  k  —  j)  x  [n  —  k  —  j)  block  of  S,  so  that 


S 


0 

0  ' 

0 

T 

0 

0 

0 

0 

(70) 
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Combining  (62),  (64),  and  (67)  yields 


FGA-A=U 

Combining  (65)  and  (70)  yields 

if(-i) 

0 
0 

Furthermore, 


//(-!) 

0 

0 


H  \R\T  )  -1  S  1/^ 


0 

H^-^^RT 

i7(-i)  r  0  ' 

0 

-T 

0 

0 

0 

-0 

0 

H^-^^RT 

//(-!)  r  0  ' 

0 

-T 

0 

0 

0 

-0 

(71) 


(72) 


<  |/7<-'>7ir||'F||i/<-">re||'F||r||i  +  ||e||l  (73) 


Moreover, 


llfill  IIUI 

(74) 

and 

F  0||  < 

ril  lieil 

(75) 

Combining  (70)  and  (63)  yields 

ll^ll  <  (^k+i 

(76) 

and 

||0||  <  (^k+j+i- 

(77) 

Combining  (71),  (72),  (73),  (74),  (75),  (76),  (77),  and  the  fact  that  U  and  V  are  unitary 
yields 


||FG^  -  Af  <  (||i/'-'>|r  lliiir  +  1)  (fft+i)"  +  (||i/'-'>|r  llrf  +  1)  (ffi+,+i)".  (78) 

Combining  Lemma  13  and  the  fact  that  the  entries  of  R  are  i.i.d.  Gaussian  random 
variables  of  zero  mean  and  unit  variance,  as  are  the  entries  of  F,  yields 

\\R\\  <  \/2  max(j,/)  7  (79) 


and 

||F||  <  a/2  max(m  —  k  —  j,  1)  7,  (80) 

with  probability  not  less  than 


27^ 

4  (72  —  1)  i/tt  max(m  —  k  —  j,  1)  7^ 


2  \  max(m— /c— /, /) 


27^ 

4  (72  —  1)  i/tt  max(j,  /)  7^ 


2  \  max(yZ) 


•  (81) 
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Combining  (78),  (68),  (79),  and  (80)  shows  that 

||FG^-^P<  (2/ max(j, /)/3^7^+ 1)  ((Tfc+i)'^+ (2/ max(m  -  fc  -  +  1)  {Ok+j+if 

(82) 

with  probability  not  less  than  the  amount  in  (59).  Combining  (82)  and  the  fact  that 

^/x  +  y  <\/x  +  ^  (83) 

for  any  nonnegative  real  numbers  x  and  y  yields  (60).  □ 

Given  an  m  x  n  matrix  A,  and  a  matrix  G  whose  entries  are  i.i.d.  Gaussian  random 
variables  of  zero  mean  and  unit  variance,  the  following  lemma  provides  a  highly  probable 
upper  bound  on  the  singular  values  of  the  product  GA  in  terms  of  the  singular  values  of 
A]  the  lemma  is  most  useful  when  the  singular  values  of  A  decay  sufficiently  fast,  and  the 
numbers  j  and  I  in  the  lemma  are  both  much  less  than  m. 


Lemma  20  Suppose  that  j,  k,  I,  m,  and  n  are  positive  integers  with  k  <  I,  such  that 
k  +  j  <  m  and  k  +  j  <  n.  Suppose  further  that  A  is  a  real  m  x  n  matrix,  G  is  a  real  I  x  m 
matrix  whose  entries  are  i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance, 
and  J  is  a  positive  real  number,  such  that  7  >  1  and 


^  =  1  - 


4  (72  —  1)  max(m  —  k  —  j,  1)  7^  \e'>' 


27" 


2  \  max{m—k—j,l) 


v2_l 


4  (72  —  1)  max(/c  +  j,  1)  7^  \e'>' 


27' 


2  \  max(A:+j,  Z) 


y2_i 


(84) 


is  nonnegative. 

Then,  the  {k  +  1)**  greatest  singular  value  pk+i  of  G  A  is  at  most  a  certain  linear  combi¬ 
nation  of  the  {k  +  1)®*  greatest  singular  value  ak+i  of  A  and  the  {k-\-j-\- 1)®*  greatest  singular 
value  Ck+j+i  of  A,  namely. 


Pk+i  <  \/2  m&x{k  +  j,  /)  7  (Jk+i  +  4/2  max(m  -k-j,l)'y  ak+j+i,  (85) 

with  probability  not  less  than  the  amount  in  (Sf). 

Proof.  We  start  by  forming  an  SVD  of  A, 

A  =  UTV^,  (86) 

where  [/  is  a  real  unitary  mxm  matrix,  S  is  a  real  mxn  matrix  whose  entries  are  nonnegative 
everywhere  and  zero  off  of  the  main  diagonal,  and  P  is  a  real  unitary  n  x  n  matrix,  such 
that 

=  (Ji  (87) 

for  alH  =  1,  2,  . . . ,  min(m,n)  —  1,  min(m,n),  where  is  the  entry  in  row  i  and  column  i 
of  S,  and  a*  is  the  greatest  singular  value  of  A. 
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Combining  (86)  and  the  fact  that  V  is  unitary  yields  that  GA  has  the  same  singular 
values  as  GUT,. 

Next,  we  dehne  auxiliary  matrices  H  and  R.  We  dehne  H  to  be  the  leftmost  I  X  {k  +  j) 
block  of  the  /  x  m  matrix  G  U,  and  R  to  be  the  rightmost  I  x  [m  —  k  —  j)  block  of  G  U,  so 
that 

GU={H\R).  (88) 

Combining  the  facts  that  U  is  real  and  unitary,  and  that  the  entries  of  G  are  i.i.d.  Gaussian 
random  variables  of  zero  mean  and  unit  variance,  we  see  that  the  entries  of  H  are  also 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  as  are  the  entries  of  R. 


Combining  (88)  and  the  fact  that  G  A  has  the  same  singular  values  as  GUT  yields  that 
G  A  has  the  same  singular  values  as  (  hf  |  0  )  s+(o|i?)  s. 


It  follows  from  (87)  that 

(  0  i?  )  T\\  <  ||i?||  ak+j+i- 

(89) 

Combining  (17)  and  (89)  yields 

Pk+l  <  U+l  +  \\R\\  Ckk+j+l, 

(90) 

where  pk+i  is  the  {k  +  1)®*  greatest  singular  value  of(i70)S-|-(0 
the  (/c-l-l)®*  greatest  singular  value  of  (  77  0  )  S;  p^+i  is  also  the  (7-1-1) 
value  of  G  A,  since  G  A  has  the  same  singular  values  as(770)S-|-( 
Furthermore, 

IK  |o  )ll  <  ll^^ll- 

R  )  T,  and  r^+i  is 
greatest  singular 
0\R)  T. 

(91) 

Combining  (14),  (91),  and  (87)  yields 

'Tfc+i  <  1  -77|  (Jfc+i. 

(92) 

Combining  Lemma  13  and  the  fact  that  the  entries  of  77  are  i.i.d.  Gaussian  random 
variables  of  zero  mean  and  unit  variance,  as  are  the  entries  of  R,  shows  that 

||77||  <  ^2  max(k  +  j,l)  7 

(93) 

and 

\\R\\  <  \/2  max(m  —  k  —  j,l)  'j 

(94) 

with  probability  not  less  than  the  amount  in  (84). 

Combining  (90),  (92),  (93),  and  (94)  yields  (85). 

□ 

4  Description  of  the  algorithm 

In  this  section,  we  describe  the  algorithm  of  the  present  paper.  In  Subsection  4.1,  we  discuss 
approximations  to  interpolative  decompositions.  In  Subsection  4.2,  we  discuss  approxima¬ 
tions  to  SVDs.  In  Subsection  4.3,  we  tabulate  the  computational  costs  of  various  parts  of 
the  algorithm.  In  Subsection  4.4,  we  describe  Table  1. 
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4.1  Interpolative  decomposition 

Suppose  that  k,  m,  and  n  are  positive  integers  with  k  <m  and  k  <  n,  and  4  is  a  real  mxn 
matrix.  In  this  subsection,  we  will  collect  together  k  appropriately  chosen  columns  of  A  into 
a  real  m  x  k  matrix  B,  and  construct  a  real  k  x  n  matrix  P,  such  that 

||P||  <  ^Ak  {n-k)  +  l  (95) 

and 

\\BP-A\\<ak+i,  (96) 

where  a^+i  is  the  {k  +  1)®*  greatest  singular  value  of  A.  To  do  so,  we  select  an  integer  I  with 
I  >  k,  such  that  I  <  m  and  /  <  n  (/  =  /c  +  20  is  often  a  suitable  choice),  and  make  the 
following  three  steps: 

1.  Using  a  random  number  generator,  form  a  real  I  x  m  matrix  G  whose  entries  are 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  and  compnte  the 
I  X  n  product  matrix 

R  =  GA.  (97) 

2.  Using  the  algorithm  of  [6],  form  a  real  I  x  k  matrix  S  whose  columns  constitute  a  subset 
of  the  columns  of  P,  and  a  real  k  x  n  matrix  P  satisfying  (95),  such  that 

ll^P-Pil  <  ^Ak{n-k)  +  lpk+i,  (98) 

where  pk+i  is  the  {k  +  l)*^*  greatest  singular  value  of  R.  (See  Observation  3  for  a  brief 
discussion  of  the  properties  of  the  algorithm  of  [6].) 

3.  Using  the  fact  that  the  columns  of  S  constitute  a  subset  of  the  columns  of  P,  for  any 
j  =  1,  2,  . . . ,  /c  —  1,  /c,  let  ij  denote  an  integer  snch  that  the  column  of  S  is  the 
column  of  P.  Form  the  real  m  x  k  matrix  B  whose  column  is  the  column  of  A 
for  all  j  =  1,  2,  . . . ,  /c  —  1,  k. 

The  matrices  B  and  P  obtained  via  the  preceding  three  steps  satisfy  (95)  and  (96);  see  the 
following  observation. 

Observation  21  It  is  easy  to  see  that  the  matrices  B  and  P  satisfy  (96).  Indeed,  combin¬ 
ing  (97)  and  Remark  16  yields 

S  =  GB.  (99) 

Combining  (98),  (97),  and  (99)  yields 

\\GBP-GA\\  <  ^/4k  {n  -  k)  +  1  pk+i,  (100) 

where  pk+i  is  the  {k  +  1)®*  greatest  singnlar  value  of  P.  Suppose  that  f3  and  7  are  positive 
real  numbers  such  that  7  >  1  and 

v-1  ^  ^  I  (  Y"  nnu 

^  \J2t[  {I  —  k  +  1)  \(^~^  +  l)/5/  2  (7^  —  1)  a/  7rm7^  / 
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is  nonnegative.  Then,  combining  (20),  (40),  (41),  (95),  (100),  (14),  (97),  and  Lemma  13 
yields 

\\BP-A\\ 

<  7^  +  1  ^a/4A;  {n  —  k)  +  1  +  1  j  +  /5  7  V2lm  A/4Ar(n'--A)y'TTj  ak+i  (102) 

with  probability  not  less  than  y  defined  in  (101),  where  Ck+i  is  the  (/c  + 1)®*  greatest  singular 
value  of  A.  The  bound  (102)  is  a  precise  version  of  (96).  For  example,  choosing  (3  =  3/4, 
7^  =  5,  and  /  =  /c  +  20,  and  combining  (102)  and  (101),  we  obtain 

\\B  P  —  A\\  <  10  ^/k  {k  +  20)  mn  ak+i  (103) 

with  probability  not  less  than  1  —  10“^^.  Table  1  contains  similar  results  obtained  by  taking 
other  values  for  I  —  k,  P,  and  7. 

Observation  22  When  the  singular  values  of  A  decay  sufficiently  fast,  and  I  is  much  less 
than  m,  the  factors  \/2lmP‘^  7^  +  1  and  V2lm  in  (102)  are  much  larger  than  necessary. 
Indeed,  suppose  that  j  is  a  positive  integer  with  k  +  j  <  m  and  k  +  j  <  n,  and  P  and  7  are 
positive  real  numbers,  such  that  7  >  1  and  <h  +  T  >  1,  where  <h  is  defined  in  (59),  and  T  is 
defined  in  (84).  Then,  combining  (20),  (60),  (61),  (95),  (100),  (85),  and  (97)  yields 

\\BP  -  A\\<^ak+i+r]ak+j+i  (104) 

with  probability  not  less  than  $  +  4/  —  1,  where  <h  is  defined  in  (59),  4/  is  defined  in  (84), 
(Tfc+i  is  the  {k  +  1)®*  greatest  singular  value  of  A,  and  au+j+i  is  the  {k  +  j  +  1)®*  greatest 
singular  value  of  A,  and  where 

^  =  a/2/  max(j,  1)  72  -I-  1  ^a/4A;  {n  —  k)  +  1  +  ij 

+  /5  7  a/2/  max(/c  +  j,  /)  a/4/c  (n  —  /c)  +  1  (105) 


and 

7  =  a/2/  max(m  —  k  —  j,  1)  +  1  ^a/4A;  {n  —  k)  +  1  + 

+  PiV^  max(m  —  k  -  j,  /)  ^Ak  {n-k)  +  1.  (106) 

When  j,  k,  and  /  are  all  much  less  than  m,  clearly  ^  is  much  less  than  rj.  In  many  practical 
situations,  ^  is  greater  than  7]ak+j+i,  so  that  the  right-hand  side  of  (104)  is  effectively 
independent  of  m. 

Remark  23  If  we  choose  I  =  k  in  the  algorithm  of  the  present  subsection  (instead  of 
choosing  I  >  k),  then  we  must  replace  (98)  with  the  formula 

||^P-R||=0.  (107) 

All  other  aspects  of  the  algorithm  stay  the  same  in  the  case  that  I  =  k.  In  particular,  (102) 
and  (104)  hold  in  the  case  that  /  =  k,  too. 
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4.2  Singular  Value  Decomposition 

Suppose  that  k,  m,  and  n  are  positive  integers  with  k  <m  and  k  <  n,  and  4  is  a  real  mxn 
matrix.  In  this  subsection,  we  will  construct  an  approximation  to  an  SVD  of  A  such  that 

\\UJ:V^-A\\<ak+i,  (108) 

where  t/  is  a  real  m  x  k  matrix  whose  columns  are  orthonormal,  Id  is  a  real  n  x  k  matrix 
whose  columns  are  orthonormal,  S  is  a  diagonal  real  k  x  k  matrix  whose  entries  are  all 
nonnegative,  and  Ck+i  is  the  {k  +  1)®*  greatest  singular  value  of  A.  To  do  so,  we  select  an 
integer  I  with  I  >  k,  such  that  I  <  m  and  /  <  n  (/  =  /c  +  20  is  often  a  suitable  choice),  and 
make  the  following  hve  steps: 

1.  Using  a  random  number  generator,  form  a  real  I  x  m  matrix  G  whose  entries  are 
i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  and  compute  the 
I  X  n  product  matrix 

R  =  GA.  (109) 

2.  Using  an  SVD,  form  a  real  nx  k  matrix  Q  whose  columns  are  orthonormal,  such  that 
there  exists  a  real  k  x  I  matrix  S  for  which 

\\QS-R^\\<Pk+i,  (110) 

where  pk+i  is  the  {k  +  1)®*^  greatest  singular  value  of  R.  (See  Observation  5  for  details 
concerning  the  construction  of  such  a  matrix  Q.) 

3.  Compute  the  m  x  k  product  matrix 

T  =  AQ.  (Ill) 

4.  Form  an  SVD  of  T, 

T  =  USIU^,  (112) 

where  U  is  a  real  m  x  k  matrix  whose  columns  are  orthonormal,  IF  is  a  real  k  x  k 
matrix  whose  columns  are  orthonormal,  and  S  is  a  real  kxk  matrix  whose  entries  are 
all  nonnegative  and  zero  off  of  the  main  diagonal.  (See,  for  example.  Chapter  8  in  [15] 
for  details  concerning  the  construction  of  snch  an  SVD.) 

5.  Compnte  the  n  x  k  prodnct  matrix 


V  =  QW.  (113) 

The  matrices  U,  S,  and  V  obtained  via  the  preceding  hve  steps  satisfy  (108);  see  the  following 
observation. 

Observation  24  It  is  easy  to  see  that  the  matrices  U,  S,  and  V  satisfy  (108).  Indeed, 
snppose  that  f3  and  7  are  positive  real  numbers  such  that  7  >  1  and  y  dehned  in  (101)  is 
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nonnegative.  Then,  combining  (25),  (40),  (41),  (111),  (112),  (113),  (110),  (14),  (109),  and 
Lemma  13  yields 


\\UT.V^  -A\\  <  (2  +  1  +  2V^  ak+i  (114) 

with  probability  not  less  than  y  defined  in  (101),  where  Ck+i  is  the  (/c  + 1)®*  greatest  singular 
value  of  A.  The  bound  (114)  is  a  precise  version  of  (108).  For  example,  choosing  (3  =  3/4, 
7^  =  5,  and  I  =  k  +  20,  and  combining  (114)  and  (101),  we  obtain 

\\UT.  V'^  —  All  <  10  a/(/i;  +  20)  m  (115) 

with  probability  not  less  than  1  —  10“^^.  Table  1  contains  similar  results  obtained  by  taking 
other  values  for  I  —  k,  P,  and  7. 

Observation  25  When  the  singular  values  of  A  decay  sufficiently  fast,  and  I  is  much  less 
than  m,  the  factors  7^  +  1  and  y/2lm  in  (114)  are  much  larger  than  necessary. 

Indeed,  suppose  that  j  is  a  positive  integer  with  k  +  j  <  m  and  k  +  j  <  n,  and  P  and  7  are 
positive  real  numbers,  such  that  7  >  1  and  <h  +  T  >  1,  where  <F  is  defined  in  (59),  and  T  is 
defined  in  (84).  Then,  combining  (25),  (60),  (61),  (111),  (112),  (113),  (110),  (85),  and  (109) 
yields 

WU'LV'^  -  A\\  <^ak+i+ri(Jk+j+i  (116) 

with  probability  not  less  than  <h  +  T  —  1,  where  <h  is  defined  in  (59),  T  is  defined  in  (84), 
(Tfc+i  is  the  {k  +  1)®*  greatest  singular  value  of  A,  and  a^+j+i  is  the  {k  +  j  +  1)®*  greatest 
singular  value  of  A,  and  where 

^  =  2  a/2/  max(j,  1)  P"^  +  1  +  2  \/ 21  max(A;  +  j,  /)  /dy  (117) 

and 

7  =  2  a/2/  max(m  —  /c  —  j,  /)  7^  +  1  +  2  a/2/  max(m  —  k  —  j,  1)  P'y.  (118) 

When  j,  k,  and  /  are  all  much  less  than  m,  clearly  ^  is  much  less  than  rj.  In  many  practical 
situations,  ^  is  greater  than  rjak+j+i,  so  that  the  right-hand  side  of  (116)  is  effectively 
independent  of  m. 

Remark  26  If  we  choose  I  =  k  in  the  algorithm  of  the  present  subsection  (instead  of 
choosing  I  >  k),  then  we  must  replace  (110)  with  the  formula 

||Q^-R^||=0.  (119) 

All  other  aspects  of  the  algorithm  stay  the  same  in  the  case  that  I  =  k.  In  particular,  (114) 
and  (116)  hold  in  the  case  that  /  =  k,  too. 

4.3  CPU  time  and  memory  requirements 

In  this  subsection,  we  tabulate  the  numbers  of  floating-point  operations  and  words  of  memory 
required  by  the  algorithms  described  in  Subsections  4.1  and  4.2,  as  applied  once  to  a  matrix 

A. 
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4.3.1  Interpolative  decomposition 

The  algorithm  described  in  Subsection  4.1  incurs  the  following  costs  in  order  to  compute  an 
approximation  to  an  interpolative  decomposition  of  A: 

1.  Forming  R  in  (97)  requires  applying  to  I  vectors. 

2.  Computing  S  and  P  in  (98)  or  (107)  costs  0{lkn  log(n)). 

3.  Forming  B  in  (99)  requires  applying  A  io  k  vectors,  where  each  vector  has  a  single 
entry  of  1  and  n  —  1  entries  of  0. 

Summing  up  the  costs  in  Steps  1-3  above,  we  conclude  that  the  algorithm  of  Subsection  4.1 
costs 

CiD  =  k  ■  Ca  +  I  ■  +  0{lkn  log(n)),  (120) 

where  Ca  is  the  cost  of  applying  A  to  a  real  n  x  1  column  vector,  and  C^t  is  the  cost  of 

applying  A"^  to  a  real  m  x  1  column  vector. 

4.3.2  Singular  Value  Decomposition 

The  algorithm  described  in  Subsection  4.2  incurs  the  following  costs  in  order  to  compute  an 
approximation  to  an  SVD  of  A: 

1.  Forming  R  in  (109)  requires  applying  A"^  to  I  vectors. 

2.  Computing  Q  in  (110)  or  (119)  costs  0{Rn). 

3.  Forming  T  in  (111)  requires  applying  A  to  k  vectors. 

4.  Computing  the  SVD  (112)  of  T  costs  0{k‘^m). 

5.  Forming  V  in  (113)  costs  0{k‘^n). 

Summing  up  the  costs  in  Steps  1-5  above,  we  conclude  that  the  algorithm  of  Subsection  4.2 
costs 

C*svD  =  k  ■  Ca  +  I  ■  Cat^  +  0{k'^  m  1“^  n),  (121) 

where  Ca  is  the  cost  of  applying  A  to  a  real  n  x  1  column  vector,  and  C^t  is  the  cost  of 
applying  A"^  to  a  real  m  x  1  column  vector. 

Remark  27  We  observe  that  the  algorithm  of  the  present  paper  only  requires  applying  A 
to  k  vectors  and  A^  to  I  vectors;  it  does  not  require  explicit  access  to  the  individual  entries 
of  A.  This  consideration  can  be  important  when  the  entries  of  A  are  not  available  explicitly, 
but  instead  A  and  A"^  are  available  solely  in  the  form  of  procedures  for  their  applications 
to  arbitrary  vectors.  Often  such  procedures  for  applying  A  and  A'^  cost  much  less  than  the 
standard  procedure  for  applying  a  dense  matrix  to  a  vector. 

Remark  28  Without  any  further  analysis,  we  can  observe  that  the  cost  in  (120)  of  apply¬ 
ing  the  algorithm  of  Subsection  4.1  to  any  m  x  n  matrix  A  is  in  principle  less  than  the 
0{kmn  log(n))  cost  of  using  the  algorithm  discussed  in  Observation  3  directly  on  A,  pro¬ 
vided  that  I  is  sufficiently  less  than  k  log(n),  and  that  both  m  and  n  are  much  greater  than 
both  k  and  /. 
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4.4  Description  of  Table  1 

Tables  1. 1-1.6  provide  an  upper  bound  on  the  probability  that 

\\U^V^  -  A\\>  C^imak+i,  (122) 

where  U ,  S,  and  V  are  the  matrices  in  the  approximation  to  an  SVD  of  the  mx  n  matrix  A 
in  (114).  In  (122),  k  and  I  are  any  positive  integers  with  k  <  I,  such  that  I  <  m  and  I  <  n, 
(Tfc+i  is  the  {k  +  1)®*  greatest  singular  value  of  4,  and  (  takes  on  the  values  specihed  by  the 
penultimate  columns  of  the  tables.  The  quantity  Ili_k^i3^^  is  defined  by  the  formula 

Ai-k,i3,'f  =  1  —  X)  (123) 

where  y  is  dehned  in  (101),  and  I  —  k,  P,  and  7  take  on  the  values  specihed  by  the  hrst, 
second,  and  third  columns  of  the  tables.  The  quantity  (  is  specihed  by  P  and  7  via  (114). 
Please  note  that  depends  only  on  I  —  k,  P,  and  7,  and  provides  an  upper  bound 

that  is  otherwise  independent  of  k,  m,  n,  and  A]  (115)  provides  a  similar  bound.  When  the 
singular  values  of  A  decay  sufficiently  fast,  and  I  is  much  less  than  m,  the  factor  of  y/m  in 
the  right-hand  side  of  (122)  is  larger  than  necessary;  see  Observation  25  above. 

Remark  29  Due  to  (102),  the  quantity  Ili_k,i3,'y  dehned  in  (123)  also  provides  an  upper 
bound  on  the  probability  that 

\\B  P  —  A\\  >  (  Vklmn  ak+i,  (124) 

where  B  and  P  are  the  matrices  in  the  approximation  to  an  interpolative  decomposition  of 
the  mx  n  matrix  A  in  (102). 

5  Numerical  results 

In  this  section,  we  describe  the  results  of  hve  numerical  tests  of  the  algorithm  of  the  present 
paper.  Table  2  summarizes  the  numerical  output  of  the  examples  described  in  the  present 
section. 

Tables  2. 1-2.5  display  the  results  of  applying  the  algorithm  of  the  present  paper  once 
to  a  real  n  x  n  matrix  A,  for  the  indicated  values  of  n.  The  matrix  A  is  dehned  at  the 
end  of  the  present  section.  The  numbers  k  and  I  are  those  from  Section  4;  k  is  the  rank 
of  the  approximations  to  A,  and  /  is  the  number  of  rows  in  the  matrix  G  whose  entries 
are  i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance  (the  algorithm  uses  the 
product  G  A).  The  displayed  times  refer  to  the  seconds  of  CPU  time  used  by  the  algorithm  to 
compute  both  the  approximation  to  an  interpolative  decomposition  and  the  approximation 
to  an  SVD  of  A.  (Please  note  that  our  implementation  is  optimized  for  accuracy  and  for 
analyzing  the  numerical  properties  of  the  algorithm,  and  is  probably  not  very  efficient.)  The 
numbers  ak  and  ak+i  are  those  from  the  dehnition  of  A  below;  furthermore,  (Tk+i  appears  in 
the  bounds  (102)  and  (114)  on  the  errors  of  the  approximations.  The  number  (5id  is  the  norm 
of  the  difference  between  A  and  an  approximation  B  P  to  an  interpolative  decomposition  of 
A,  that  is, 

6iT,  =  \\BP-Al  (125) 
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where  the  matrices  B  and  P  are  those  from  (102).  The  number  (5svd  is  the  norm  of  the 
difference  between  A  and  the  approximation  U  S  to  an  SVD  of  A,  that  is, 

6syD  =  \\UEV^  -Al  (126) 

where  the  matrices  U,  S,  and  V  are  those  from  (114). 

We  dehne  (5rei.  max.  as  follows.  First,  we  dehne  auxiliary  vectors  A,  . . . ,  P,  and 
r^,  r^,  . . . ,  P ,  where  j  =  30  in  Examples  1  and  2,  j  =  90  in  Example  3,  j  =  6  in 
Example  4,  and  j  =  5  in  Example  5.  We  choose  the  test  vectors  . . . ,  P~^,  P  to  include 
a  variety  of  deterministic  and  random  vectors  (specihcally,  we  set  every  entry  of  A  to  be 
1,  and  use  a  random  number  generator  to  generate  t^,  . . . ,  P~^,  P  so  that  their  entries 
are  i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance).  For  any  i  =  1,  2,  . . . , 
j  —  1,  j,  we  dehne  r*  to  be  the  vector  resulting  from  the  application  of  B  P  —  A  to  P,  that  is, 

P  =  {BP-A)t\  (127) 


where  the  matrices  B  and  P  are  those  from  (102).  Then,  we  dehne  (5rei.max.  via  the  formula 


<^rel. 


max. 


max 


/ maXp— l,m  l(^  )pl\ 

V  maXg=i,2,...,n-l,n  \{P)q\  J  ’ 


(128) 


where  {P)p  is  the  entry  of  r*,  and  {P)q  is  the  entry  of  P. 

All  estimates  displayed  in  Table  2  are  the  maximum  values  obtained  from  three  indepen¬ 
dent  realizations  of  the  random  variables  involved. 

The  values  of  (5id  and  (5svd  displayed  in  Tables  2.2,  2.3,  2.4,  and  2.5  are  those  obtained  via 
the  power  method  for  estimating  the  norm  of  a  matrix,  after  the  estimates  stabilized  to  three 
signihcant  hgures.  The  values  of  (5id  and  (5svd  displayed  in  Table  2.1  are  those  obtained  after 
100  iterations  of  the  power  method.  The  estimates  of  (5id  and  (5svd  summarized  in  Table  2.1 
did  not  stabilize  to  three  signihcant  hgures  after  100  (or  any  other  number  of)  iterations, 
undoubtedly  due  to  round-oh. 

We  performed  all  computations  using  IEEE  standard  double-precision  variables,  whose 
mantissas  have  approximately  one  bit  of  precision  less  than  16  digits  (so  that  the  relative 
precision  of  the  variables  is  approximately  .2e-15).  We  ran  all  computations  on  a  2.8  GHz 
Pentium  Xeon  microprocessor  with  512  KB  of  L2  cache  and  2  GB  of  RAM.  We  compiled  the 
Fortran  77  code  using  the  Lahey-Fujitsu  compiler,  with  the  optimization  hag  — o2  enabled. 

In  our  implementation,  we  computed  SVDs  using  2-sided  plane  (Jacobi/Givens)  rota¬ 
tions  (see,  for  example,  Ghapter  8  in  [15]).  We  used  an  algorithm  based  upon  pivoted  “QR” 
decompositions  to  compute  the  matrices  S  and  P  in  (98)  and  (107)  (see,  for  example,  Ghap¬ 
ter  5  in  [15]  for  a  description  of  “Q  R”  decompositions,  and  [6]  for  further  details  regarding 
our  particular  implementation). 

In  Examples  1,  2,  and  3,  we  use  a  pseudorandom  number  generator  to  construct  real 
n  X  1  vectors  fR,  . . . ,  and  z/^,  z/^,  . . . ,  z/l“^,  z/-^,  such  that  their  entries  are  a 

realization  of  i.i.d.  Gaussian  random  variables  of  zero  mean  and  unit  variance,  with  j  =  20 
in  Examples  1  and  2,  and  j  =  60  in  Example  3.  We  orthonormalize  /i^,  pA,  . . . , 
via  the  Gram-Schmidt  process  with  reorthogonalization  (see,  for  example,  [3])  to  obtain  real 
n  X  1  vectors  u^,  v?,  . . . ,  ,  and  do  the  same  with  z/^,  z/^,  . . . ,  z/l“^,  to  obtain 
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real  n  x  1  vectors  v"^,  . . . ,  v^~^,  vK  We  denote  by  di,  a-2,  . . . ,  o-j  the  positive  real 
numbers  displayed  in  Figure  1  (we  use  the  numbers  in  Figure  1.1  for  Example  1,  the  numbers 
in  Figure  1.2  for  Example  2,  and  those  in  Figure  1.3  for  Example  3).  We  dehne  A  to  be  the 
n  X  n  matrix  given  by  the  formula 


A  =  (Tj 

i=l 


(129) 


Clearly,  the  rank  of  A  is  j.  Since  u^,  v?,  . . . ,  are  orthonormal,  as  are  . . . , 

,  the  singular  value  of  A  is  (Tj,  for  alH  =  1,  2,  . . . ,  j  —  1,  j.  Table  2  displays  the 
results  of  applying  the  algorithm  of  the  present  paper  to  A,  for  various  values  of  n  (Table  2.1 
displays  the  results  for  Example  1,  Table  2.2  displays  the  results  for  Example  2,  and  Table  2.3 
displays  those  for  Example  3). 

Example  4  is  designed  to  illustrate  that  factors  of  the  order  of  ^J4:k{n  —  k)  +  1  are 
necessary  in  bounds  such  as  (102)  and  (104).  This  example  is  identical  to  Examples  1,  2, 
and  3,  using  the  same  matrix  A  dehned  in  (129),  but  with  n  assumed  to  be  divisible  by  8, 
with  j  =  4,  and  using  the  numbers  ai,  (T2,  o's,  and  (T4  displayed  in  Figure  1.4  (ai  =  1,  (T2  =  1, 
(T3  =  .le-7,  and  <74  =  .le-7),  and  with  the  following  vectors  v},  v?,  u^,  and  v^,  v^, 


,1\T 


=  ^  (  1  1  •••  1  1  )  , 
.  /in  ^  / 


n 


n 


[u 


.4\T 


n 


(v^)^  = 


at  _ 

1 

)  — 

Vn 

-1 

-1 

1 

1  - 

at 

[v 

)  = 

(  2\ 

(V  ) 

1 

_  ( 

-  2  ^ 

(  4\T 

[v  ) 

V 

1  -1  1  -1 ), 


(130) 

(131) 


«3)T  =  ^(1  1-1-111-1  -1  ...  11-1-111-1-1), 

(132) 

1111-1-1-1-1), 

(133) 

(134) 

(135) 

(136) 

(137) 


A/n  -  1 


11...  1110), 


^=(0  0  ...  0  0  0  1), 

1-11-1...  1-11-100), 

=  ^(1  0  -1  0  0  0  ...  0  0), 

that  is, 

1.  (a)  every  entry  of  is  1/^/n, 

(b)  entries  1,  2,  . . . ,  n  —  2,  n  —  1  of  are  1/ yjn  —  1,  and  entry  n  of  is  0, 

2.  (a)  every  even  entry  of  is  — l/^n,  and  every  odd  entry  of  v?  is  1/^/n, 

(b)  entries  1,  2,  . . . ,  n  —  2,  n  —  1  of  are  0,  and  entry  n  of  is  1, 
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3.  (a)  the  first  pair  of  entries  of  is  l/y/n,  the  second  pair  of  entries  is  —1/y/n,  the 

third  pair  of  entries  is  1/^/n,  the  fourth  pair  of  entries  is  —1/y/n,  and  so  on  (with 
each  successive  pair  of  entries  alternating  sign), 

(b)  every  even  entry  of  except  for  entry  n  is  —l/\/n  —  2,  every  odd  entry  of 
except  for  entry  n  —  1  is  l/\/n  —  2,  and  entries  n  —  1  and  n  of  are  0, 


4.  (a)  the  first  quadruplet  of  entries  of  is  fhe  second  quadruplet  of  entries  is 

the  third  quadruplet  of  entries  is  fhe  fourth  quadruplet  of  entries 

is  —1/y/n,  and  so  on  (with  each  successive  quadruplet  of  entries  alternating  sign), 
and 


(b)  entry  1  of  is  1/v^,  entry  2  of  is  0,  entry  3  is  — 1/v^,  and  entries  4,  5,  ... , 
n  —  1,  n  are  0. 


For  this  example  (Example  4), 


(Ti  (n^)’’"  +  02  (n^) 


2\T 


1 

(n  -  1) 


1 

1 

1 

1 


\  1  1 


1  1  \ 
1  1  -y/n^ 

1  1  Vn- 1 
1  1 

1  1 

1  1  -Vn-1 / 


(138) 


Clearly,  u^,  v?,  u^,  and  u‘^  are  orthonormal,  as  are  v^,  v^,  and  Therefore,  the 

singular  value  of  A  is  Oi,  for  all  i  =  1,  2,  3,  4.  Table  2.4  displays  the  results  of  applying  the 
algorithm  of  the  present  paper  to  A,  for  various  values  of  n. 

Example  5  is  designed  to  illustrate  that  factors  of  the  order  of  ^/m  are  necessary  in 
bounds  such  as  (102)  and  (114).  In  this  example,  we  dehne  A  to  be  the  n  x  n  matrix  given 
by  the  formula 

A  =  uv^  +  al,  (139) 


where  o  =  .le-6,  and  u  and  v  are  the  real  n  x  1  column  vectors  given  by  the  formulae 


M^=(l  0  0  ...  0  0)  (140) 

and 

I  1  ...  1  1  )  .  (141) 

yn 

It  follows  from  (16)  that  the  greatest  singular  value  o^  of  A  is  equal  to  .le-6  for  k  =  2,  3, 
. . . ,  n  —  2,  n  —  1.  Table  2.5  displays  the  results  of  applying  the  algorithm  of  the  present 
paper  to  A,  for  various  values  of  n. 


Remark  30  All  numerical  data  that  we  have  examined  —  including  the  data  displayed  in 
Table  2,  as  well  as  the  data  from  further  experiments  —  appear  to  satisfy  the  bounds  (102), 
(104),  (114),  and  (116).  The  loss  of  precision  displayed  in  Table  2.1  as  n  increases  is  probably 
largely  due  to  round-off  (compare  Table  2.2),  whereas  the  loss  of  precision  in  ^id  displayed 


26 


in  Table  2.4  as  n  increases  suggests  that  any  bounds  such  as  (102)  and  (104)  must  contain 
factors  of  the  order  of  The  loss  of  precision  displayed  in  Table  2.5  as  n 

increases  suggests  that  any  bounds  such  as  (102)  and  (114)  must  contain  factors  of  the  order 
of  ^/m.  In  contrast,  in  many  practical  situations  the  bounds  mentioned  in  Observations  22 
and  25  are  effectively  independent  of  m. 
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Table  1  (See  Subsection  4.4.) 
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Tables  2. 1-2.4  (See  Section  5.) 
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Table  2.5  (See  Section  5.) 
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